Abstract

In a network where the cost of flow across an edge is nonlinear in the
volume of flow, and where sources and destinations are uniform, one
can consider the relationship between total volume $v$ of flow through
the network and the minimum cost $c = \Psi(v)$ of any flow with volume $v$.
Under a simple probability model (locally tree-like directed network,
independent cost-volume functions for different edges) we show how to
compute $\Psi(v)$ in the infinite-size limit. The argument uses a
probabilistic reformulation of the cavity method from statistical physics, and
is not rigorous as presented here. The methodology seems potentially
useful for many problems concerning flows on this class of random
networks.

Key words: cavity method, network flow, probability model. 
MSC2000 subject classification: Primary: 90B15 
OR/MS subject classification: Primary: Networks/graphs 



*Research supported by N.S.F. Grant DMS0203062 



1 Introduction 



The time ("cost") it takes you to drive a given segment of road depends on
the amount ("volume") of traffic, increasing as volume increases up to some
critical value at which the road becomes jammed. So there's a cost-volume
curve for each road segment. Now consider the road network of a city, with
many vehicles simultaneously travelling from different "sources" to different
"destinations", using minimum-cost routes depending on congestion pattern.
As we linearly scale the overall volume of traffic, the average cost-per-vehicle
will also increase as volume increases, up to some critical value at which the
network becomes jammed. So there's a cost-volume curve for the network
as a whole (depending also on the source-destination pattern).

This paper gives a foundational mathematical study of the idea above, 
in an artificially simple setting. One can view this topic as akin to statistical 
physics: we seek to understand how the "macroscopic" behavior of the net- 
work (the network cost-volume function) emerges from the "microscopic" 
specification (a probability distribution on edge cost-volume functions and 
a probability distribution on network topology). And our methodology is a 
recent probabilistic reformulation of the cavity method of statistical physics. 
Apparently this is the first paper to apply such methodology to explicitly
"network flow" problems, and it is plausible that a broader range of
problems than treated here could be studied by the same methodology, albeit
with some intrinsic caveats noted in section 1.2.

Of course, the study of flows in networks is a centerpiece of classical 
Operations Research and has evident applications in several Engineering 
disciplines jl . But we don't know any work which is closely related to 
the present paper, so we will defer literature discussion until a later survey 
paper intended to present a much broader view of the topic of flows through 
random networks. We should emphasize that we are discussing deterministic 
flows on random networks, in contrast to queueing theory which studies 
random flows on deterministic networks: see for a brief survey of routing 
questions within that setting. 

1.1 A network model 

The random layer graph model. Take $M$ layers, each with $N$ vertices.
For each $1 \le i \le M-1$ create directed edges from some vertices in layer
$i$ to some vertices in layer $i+1$. The choice of edges is uniform random,
subject to the constraint

each layer-$i$ vertex has out-degree 2, and each layer-$(i+1)$ vertex
has in-degree 2.

This defines a random graph with $MN$ vertices $w$ and with $2(M-1)N$
directed edges $e$. See Figure 1.
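
The construction above can be sketched in Python. This is a minimal sketch under stated assumptions: "uniform random subject to the constraint" is realized by matching out-stubs to in-stubs uniformly at random and redrawing whenever an edge is repeated. The function name `random_layer_graph` and the 0-based layer indexing are illustrative choices, not taken from the paper.

```python
import random
from collections import Counter

def random_layer_graph(M, N, seed=None):
    """Return directed edges ((i, a), (i + 1, b)) for layers i = 0..M-1.

    Between consecutive layers every tail vertex gets out-degree 2 and every
    head vertex gets in-degree 2. Stubs are matched uniformly at random and
    the matching is redrawn if it repeats an edge (assumption of this sketch).
    """
    rng = random.Random(seed)
    edges = []
    for i in range(M - 1):
        while True:
            heads = [b for b in range(N) for _ in range(2)]  # two in-stubs per head
            rng.shuffle(heads)
            # tail a is matched to heads[2a] and heads[2a + 1]
            pairs = [(a, heads[2 * a + k]) for a in range(N) for k in range(2)]
            if len(set(pairs)) == 2 * N:  # accept only if no edge is repeated
                break
        edges.extend(((i, a), (i + 1, b)) for a, b in pairs)
    return edges

M, N = 4, 6  # the sizes used in Figure 1
edges = random_layer_graph(M, N, seed=1)
out_deg = Counter(tail for tail, _ in edges)
in_deg = Counter(head for _, head in edges)
print(len(edges), 2 * (M - 1) * N)  # both counts should agree
```

The degree counters give a quick check of the constraint: every vertex in layers 0..M-2 should appear twice as a tail, and every vertex in layers 1..M-1 twice as a head.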




Figure 1. A realization of the random layer graph with M = 4, N = 6. 

For our purposes the key feature of this model is that as $M, N \to \infty$ the
sequence of random layer graphs satisfies local weak convergence to the
infinite tree $T$ in which each vertex has in-degree 2 and out-degree 2. Local
weak convergence 0] means: 

Take a uniform random vertex to be a root of the $n$-vertex graph.
As $n \to \infty$, the subgraphs on vertices within an arbitrary fixed
distance (number of edges) from the root converge in distribution 
to the corresponding subgraph of the limit T, considered as a 
rooted graph. 

Now suppose that on each edge $e$ of the random layer graph there is a
function $(\Phi(e,v),\ v \ge 0)$ representing the cost of a flow of volume $v$ across
$e$. Equivalently, consider the cost-per-unit-flow $\phi(e,v) = v^{-1}\Phi(e,v)$. (Note:
the mathematics works more cleanly with "total cost" functions like $\Phi$ and
$\Psi$ below, but the interpretation is more intuitive in terms of
cost-per-unit-volume functions $\phi$ and $\psi$.) Suppose we wish to send flow of
volume $v_{M,N}$ through the network, i.e. from layer 1 to layer $M$, along directed edges.
Each possible such "global flow" has some total cost, and so one can seek
to study the minimum total cost as a function of volume $v_{M,N}$, under some
model of edge-costs.

The edge-cost model. Fix a probability distribution on functions $\Phi(v)$
(equivalently: on functions $\phi(v) = v^{-1}\Phi(v)$). For each edge $e$ of the
random layer graph, let $\Phi(e,v)$ be chosen independently from this probability
distribution.
Discussing $M, N \to \infty$ limits involves scaling conventions, whose details
we specify here but which (as described below) are easily interpretable
without these details. Because there are $2N$ edges between successive layers, the
typical flow per edge will be order $v_{M,N}/(2N)$. We therefore take
"standardized volume" $0 < v < \infty$ and set $v_{M,N} = 2Nv$. With the resulting order 1
flows through edges, the total cost will scale as the number $2N(M-1)$ of
edges. Thus we define standardized cost of the optimal flow with standardized
volume $v$ to be

$$\Psi_{M,N}(v) = \frac{1}{2N(M-1)}\,(\text{minimal cost over flows of volume } 2Nv \text{ through the network}).$$

The function $\Psi_{M,N}(v)$ is random because it depends on the realizations of
the graph and of edge-flow functions, but by virtue of the standardization
we expect a deterministic limit function

$$\Psi_{M,N}(v) \to \Psi(v) \text{ in probability}, \quad 0 < v < \infty$$

as $M, N \to \infty$ with not too dissimilar orders of magnitude. Set $\psi(v) =
v^{-1}\Psi(v)$. To interpret the limit function more intuitively, consider as a
benchmark the "uniform" flow of constant volume $v$ along each edge. This
has normalized volume $v$ and limit normalized cost $E\Phi(v)$. The purpose of
the standardizations is simply to be able to compare cost of the optimal flow
of given volume in our model with the cost of the uniform flow of the same
volume.

The setting where edges have some finite capacity (maximum allowed
volume) fits our setup by taking $\Phi(v) = \infty$ for $v$ larger than the capacity.
In this case we expect the network has some finite maximum standardized
volume $v^*$:

$$\Psi(v) \begin{cases} < \infty, & v < v^* \\ = \infty, & v > v^*. \end{cases} \qquad (1)$$

Note that $v^*$ will not depend on edge-costs, just on edge-capacities.

1.2 Methodology

The purpose of this paper is to point out that it is indeed possible to analyze
the model above. That is, one can via theoretical arguments obtain the
limit network cost-volume function $\Psi(v)$ for a given distribution of edge
cost-volume functions. The results are presented in section 2 in a variety of
particular cases and specializations.
To be upfront about the caveats: 
• The arguments in this paper are non-rigorous.

• The methodology deals with limits as number of vertices grows to 
infinity, and is only useful when the underlying graphs are "locally tree- 
like" (local weak convergence to some infinite tree, maybe random). 

• Getting explicit results involves numerical solution of a fixed-point 
equation (RDE) for an unknown probability distribution. 

While the latter two caveats are intrinsic to the methodology, the first is more a
technical matter: we understand conceptually what steps are needed to make a
rigorous proof, but implementing details of some of the steps in general settings
seems astronomically far out of reach of current theory. A high-level description of
the methodology, which we describe as "probabilistic reformulation of the cavity
method", can be found in [3] section 7.5. In this paper our focus is on exhibiting
the calculations (section 3) and their results in the network flow setting, without
attempting rigorous justification. However, one of our simpler examples (maximum
density of edge-disjoint infinite paths in a randomly obstructed infinite tree; section
2.4) provides an appealing benchmark problem for future development of rigorous
proofs.

We postpone further discussion until sections 3.3 and 5.

2 Results 

Within the model of section 1.1 we will describe the network
cost-per-unit-volume curve $c = \psi(v)$ in five examples at varying levels of
generality. How these results are derived will be explained in section 3.

2.1 A traffic flow model 

The real-world relationship between traffic speed and traffic density has of 
course been studied in detail; see [B] for an introduction to this theory. Let 
us take the most naive model in which speed $s$ is a decreasing linear function
of traffic density $\rho$:

$$s = s_0(1 - a\rho).$$

Note that our flow volume $v$ equals $s\rho$. This model implies there is a
maximum possible flow volume, attained at speed $s_0/2$. In our setting, "cost"
$c$ is traversal time, that is proportional to $1/s$. Solving for $c$ in terms of $v$
gives the cost-volume function for an edge:

$$c = \phi(v) = \begin{cases} c_0\,\dfrac{1 - (1 - v/w^*)^{1/2}}{v/(2w^*)}, & v \le w^* \\ \infty, & v > w^* \end{cases} \qquad (2)$$

where $c_0 = \phi(0+)$ is the cost-per-unit-volume at the zero volume limit, and
$w^*$ is the maximum volume. Also, the cost-per-unit-volume at maximum
flow equals $2c_0$.
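
The edge cost-per-unit-volume function can be checked against the speed-density relation directly. The sketch below takes the formula in the form $\phi(v) = c_0 (1 - (1 - v/w^*)^{1/2}) / (v/(2w^*))$ and assumes a road segment of unit length, so that $c_0 = 1/s_0$ and $w^* = s_0/(4a)$; the function names and the parameter values $s_0 = 2$, $a = 0.5$ are illustrative, not from the paper.

```python
import math

def phi_traffic(v, c0=1.0, w_star=1.0):
    """Edge cost-per-unit-volume: traversal time at flow volume v."""
    if v > w_star:
        return math.inf
    if v == 0.0:
        return c0  # the v -> 0+ limit
    return c0 * (1.0 - math.sqrt(1.0 - v / w_star)) / (v / (2.0 * w_star))

def cost_from_density(rho, s0, a):
    """Return (v, c) at traffic density rho, straight from s = s0 * (1 - a * rho)."""
    s = s0 * (1.0 - a * rho)
    return s * rho, 1.0 / s  # volume v = s * rho, time c = 1 / s (unit length)

s0, a = 2.0, 0.5
c0, w_star = 1.0 / s0, s0 / (4.0 * a)  # assumed normalizations
for rho in (0.1, 0.4, 0.8):            # densities below 1/(2a), the uncongested branch
    v, c = cost_from_density(rho, s0, a)
    print(round(v, 4), round(c, 4), round(phi_traffic(v, c0, w_star), 4))
```

For each density the last two printed columns should agree, and `phi_traffic(w_star, c0, w_star)` returns `2 * c0`, matching the statement about the cost at maximum flow.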

To make a probability model we take $c_0 = 1$ and let $w^*(e)$ be independent
over edges $e$ with Exponential(1) distribution. Figure 2 shows the network
cost-per-unit-volume curve $c = \psi(v)$.




Figure 2. [Plot; horizontal axis: volume (v), from 0.0 to 1.0.] The long curve
is the edge cost-per-unit-volume function $c = \phi(v)$
at (2) with $c_0 = w^* = 1$. The short curve is the network cost-per-unit-volume
function $c = \psi(v)$. Numerical results from bootstrap Monte Carlo solution of the
RDE (11). Irregularities are artifacts of sampling variation, as explained in section
3.7.

Because each edge $e$ has $\phi(e, 0+) = 1$ we obviously have $\psi(0+) = 1$. The
maximum normalized volume of network flow is numerically about 0.34 and the
corresponding cost-per-unit-volume is numerically about 1.33. The network
cost-volume curve has the same qualitative shape as the edge cost-volume
curve.

2.2 Capacity constraints 

As mentioned before, the case where edges e have a maximum capacity K(e) 
can be fitted into our framework by assigning infinite cost to larger flows. 

Figure 3. [Plot; vertical axis: cost (c); horizontal axis: volume (v).] The
network cost-per-unit-flow function $c = \psi(v)$, in the case
where the edge cost-per-unit-flow is a constant $C(e)$ up to a capacity $K(e)$, where
$(C(e), K(e))$ are independent Exponential(1) as $e$ varies. Numerical results from
bootstrap Monte Carlo solution of the RDE (11). Irregularities are artifacts of
sampling variation, as explained in section 3.7.

Taking cost-per-unit-flow to be constant up to the capacity gives

$$\phi(e,v) = \begin{cases} C(e), & 0 \le v \le K(e) \\ \infty, & v > K(e) \end{cases}$$

where $(C(e),K(e))$ are i.i.d. as $e$ varies. We treat the example where $C(e)$
and $K(e)$ are independent with Exponential(1) distribution. Figure 3 shows
the network cost-per-unit-volume function $\psi(v)$.

Some aspects of this curve are understandable by theory. Specializing a
general large deviation result (22) to the Exponential(1) case shows that

$$\psi(0+) \approx 0.23196 \text{ is the smaller solution of } x - \log x = 1 + \log 2. \qquad (3)$$

The maximal volume $v^*$ must be the same as in the previous example
(numerically, about 0.34). Because the cost-per-unit-flow on an edge is
independent of edge capacity and has mean 1, we must have $\psi(v^*) = 1$.
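
The numerical value quoted for $\psi(0+)$ can be reproduced by solving $x - \log x = 1 + \log 2$ on the interval $(0, 1)$. The sketch below uses plain bisection; the bracketing interval and iteration count are choices made here, not taken from the paper.

```python
import math

def g(x):
    # g(x) = x - log x - (1 + log 2); strictly decreasing on (0, 1)
    return x - math.log(x) - (1.0 + math.log(2.0))

lo, hi = 1e-9, 1.0  # g(lo) > 0 > g(hi), so the root lies in between
for _ in range(100):
    mid = 0.5 * (lo + hi)
    if g(mid) > 0.0:
        lo = mid  # root is to the right of mid
    else:
        hi = mid  # root is at or to the left of mid
psi_0 = 0.5 * (lo + hi)
print(psi_0)
```

The equation has a second root above 1, which is why the search is restricted to $(0, 1)$.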



2.3 Unit edge capacities and the scaling exponent in mean-field
first passage percolation

This is the first of two specializations in which we take the edge-capacities to
be constant ($K = 1$); section 3.5 explains how this leads to some
mathematical simplification. In this section, we specialize the model of the previous
section to the case $K(e) = 1$ of constant edge capacities. That is,

$$\phi(e,v) = \begin{cases} C(e), & 0 \le v \le 1 \\ \infty, & v > 1 \end{cases} \qquad (4)$$

where the $C(e)$ are i.i.d. with Exponential(1) distribution. As in the
previous section, we know from (22) that the low-volume limit of the network
cost function is

$$c = \psi(v) \downarrow \psi(0+) \approx 0.23196 \text{ as } v \downarrow 0.$$

Here we examine in more detail the cost-volume curve in the low-flow regime.
Rewrite $\psi(0+)$ as $c_{FPP}$, to emphasize its interpretation as the time constant
for first passage percolation (see section 4.1). To make an analogy below
with percolation functions, we consider the inverse function $v = \psi^{-1}(c)$
giving volume as a function of cost-per-unit-volume. Table 1 gives numerical
results in the low-flow regime.

$\lambda$                 0.280   0.300   0.320   0.340   0.360   0.380
cost $c$                  0.267   0.279   0.290   0.302   0.313   0.327
volume $v$                0.013   0.027   0.046   0.067   0.086   0.109
$12.7(c - c_{FPP})^2$     0.015   0.028   0.043   0.063   0.084   0.115

Table 1. Volume and cost-per-unit-volume relationship for model (4) in the
low volume regime. Numerical results from bootstrap Monte Carlo solution of the
RDE (15), showing a good fit to $v = \psi^{-1}(c) \approx 12.7(c - c_{FPP})^2$. The $\lambda$ is a parameter
used to construct $c$ as an implicit function of $v$.
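
The quality of the quadratic fit can be examined directly from the tabulated values. The sketch below recomputes $12.7(c - c_{FPP})^2$ from the rounded costs in Table 1 and also estimates the coefficient by least squares through the origin; that fitting procedure is an assumption made here, since the paper does not say how 12.7 was obtained.

```python
C_FPP = 0.23196  # time constant psi(0+) quoted in the text
cost = [0.267, 0.279, 0.290, 0.302, 0.313, 0.327]    # Table 1, cost c
volume = [0.013, 0.027, 0.046, 0.067, 0.086, 0.109]  # Table 1, volume v

x = [(c - C_FPP) ** 2 for c in cost]
# least-squares slope through the origin for the model v = k * (c - C_FPP)^2
k = sum(xi * vi for xi, vi in zip(x, volume)) / sum(xi * xi for xi in x)
fitted = [12.7 * xi for xi in x]

print("estimated coefficient:", round(k, 2))
for v, f in zip(volume, fitted):
    print(v, round(f, 3))
```

Because the tabulated costs are rounded to three decimals, the recomputed fit values can differ from the last row of the table in the third decimal place.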
Recall [9] that in classical site or bond percolation on $\mathbb{Z}^d$ there is a percolation
function

$$f(p) = P(\text{origin in some infinite component})$$

where $p$ is the underlying probability on sites or bonds. There is a critical
point $p^*$ at which $f(\cdot)$ becomes non-zero, and considerable attention has
been paid to scaling exponents

$$f(p) \asymp (p - p^*)^{\beta} \text{ as } p \downarrow p^*.$$

We do not know any parallel discussion in the setting of first passage
percolation, but our setup suggests one possible formulation. In the model above,
the inverse function $v = \psi^{-1}(c)$ of $c = \psi(v)$ satisfies (Table 1)

$$\psi^{-1}(c) \asymp (c - c_{FPP})^2. \qquad (5)$$

Our model can be viewed as the "mean-field" analog of oriented first
passage bond percolation on $\mathbb{Z}^d$. In the latter model, giving edges capacity
1 and interpreting the random edge-traversal times as costs, we can define
a cost-volume curve as in this paper, and presumably one gets
dimension-dependent scaling exponents in (5). This seems an interesting, though
difficult, topic for future research.

2.4 Maximal flow through the randomly obstructed networks 

Fix $1/2 < p < 1$. Consider the case where edge-costs are constant ($C = 1$)
and where the edge-capacities are either 0 or 1:

$$P(K = 1) = p; \quad P(K = 0) = 1 - p.$$

In other words, a proportion $1 - p$ of edges are obstructed and permit zero
flow. In this model, the cost-volume curve is not an issue, since normalized
cost per unit volume is just 1. However, it is natural to ask how the
maximum normalized volume $v^* = v^*(p)$ behaves as a function of $p$.

Note that we may reformulate the model by taking $K = 1$ (all edges
present with unit capacity) and taking

$$P(C = 1) = p; \quad P(C = \infty) = 1 - p \qquad (6)$$

which has the same effect of eliminating from consideration a proportion
$1 - p$ of edges. As mentioned before, the case of constant edge-capacity is
mathematically simpler.
The curve v*(p) is shown in Figure 4. The qualitative endpoint behavior
observed numerically is not hard to understand theoretically - see section 
PI 



Figure 4. The relationship between p (vertical axis, 0.5 to 1.0) and the
normalized flow v*(p) (horizontal axis, 0.0 to 1.0) in the randomly obstructed
network model. The curve was obtained by bootstrap Monte Carlo solution of the
RDE lO. T^ endpoint behavior is v*(p) w 0.76(p- 1/2) 2 asp | 1/2; 1- v*(p) » 
1.56(1 -pjlog^) aspt 1. 

2.5 Quadratic costs 

Consider the case

$$\Phi(v,e) = \kappa(e)v^2, \text{ that is } \phi(v,e) = \kappa(e)v$$

where $\kappa(e)$ is i.i.d. over edges $e$. This has the obvious scaling property that
if $\mathbf{f}$ has $v(\mathbf{f}) = v_0$, $c(\mathbf{f}) = c_0$ then a scaled flow $a\mathbf{f}$ has
$v(a\mathbf{f}) = av_0$, $c(a\mathbf{f}) = a^2 c_0$. So the network cost-per-unit-volume
function must be of the form

$$c = \psi(v) = \bar{\kappa} v$$

where $\bar{\kappa}$ depends on the distribution of $\kappa(e)$. Our normalization convention
ensures

$$\text{if } P(\kappa(e) = 1) = 1 \text{ then } \bar{\kappa} = 1.$$




Figure 5 shows numerical results in the case where $\kappa(e)$ has Gamma$(a, a)$
distribution (recall this has mean 1 and standard deviation $a^{-1/2}$). In the
$a \to \infty$ limit we have $\kappa(e) = 1$ and so $\bar{\kappa} = 1$. The "myopic" flow with
volume 1 across each edge always has normalized cost 1. As $a$ decreases,
the variability of $\kappa(e)$ increases and this causes the normalized cost $\bar{\kappa}$ of the
optimal flow to decrease, because flow can take advantage of cheaper edges.
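
The mechanism in the last sentence can be illustrated on a toy problem that is much simpler than the network model and is not part of the paper's analysis: two parallel edges with quadratic costs $\kappa_1 v_1^2$ and $\kappa_2 v_2^2$ sharing total volume 2. The optimal split puts $v_1 = 2\kappa_2/(\kappa_1 + \kappa_2)$ on the first edge, giving per-edge cost $2\kappa_1\kappa_2/(\kappa_1 + \kappa_2)$, whereas the uniform split has mean per-edge cost 1. For independent Gamma$(a, a)$ coefficients the mean of the optimal per-edge cost works out to $2a/(2a+1)$ (a side calculation made here). The function name and sample size below are illustrative.

```python
import random

def mean_split_cost(a, samples=20000, seed=0):
    """Monte Carlo mean of the optimal per-edge cost for two parallel edges.

    Toy setting only: k1, k2 ~ Gamma(shape a, rate a), total volume 2.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        k1 = rng.gammavariate(a, 1.0 / a)  # gammavariate takes (shape, scale)
        k2 = rng.gammavariate(a, 1.0 / a)
        total += 2.0 * k1 * k2 / (k1 + k2)  # optimal total cost 4*k1*k2/(k1+k2), per edge
    return total / samples

for a in (1.0, 2.0, 5.0):
    # Monte Carlo estimate next to the closed form 2a / (2a + 1)
    print(a, round(mean_split_cost(a), 3), round(2 * a / (2 * a + 1), 3))
```

In this toy problem the optimal cost stays below 1 and increases toward 1 as $a$ grows, the same qualitative behavior as described for $\bar{\kappa}$; the numerical values are not comparable to those of the network model.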



Figure 5. [Plot; horizontal axis: a, from 1.0 to 5.0; vertical axis from 0.6 to 1.0.]
The case of quadratic costs with Gamma$(a, a)$ distribution, $1 \le a \le 5$.
The curve shows $\bar{\kappa}(a)$ as a function of $a$. Numerical results from bootstrap
Monte Carlo solution of the RDE (11). Irregularities are artifacts of sampling
variation, as explained in section 3.7.

Implementing the optimal flow through the network would require central- 
ized routing policy. It is natural to compare this to decentralized routing 
schemes, and (viewing the network as traffic flow on an infinite tree) the nat- 
ural decentralized scheme is to have customers leaving each vertex choose 
the cheaper out-edge to traverse next (this of course depends on the flows 
of other customers). This customer driven scheme turns out to be compar- 
atively simple to analyze (in the infinite tree limit). Note that, compared 
to the myopic scheme, the customer driven scheme benefits from being able 
to put more flow through the cheaper out-edge; on the other hand, the fact 
that the volume of flow through different vertices is non-uniform will tend 
(in the "quadratic" setting) to increase costs. Working through the
analysis (section 4.2) gives the remarkable conclusion that in the present setting
(quadratic costs; Gamma distribution of $\kappa(e)$) the normalized cost of the
customer driven scheme is exactly 1, the same as the myopic scheme. We 
have no non-calculational explanation of this intriguing result. Also, in this 
setting the applicability of infinite-tree analysis to finite network problems 
is somewhat problematical. 
3 Implementing the cavity method



3.1 The infinite network model 

As mentioned before, the key property of the random layer graph is its local 
weak convergence to the infinite tree T = (V, E) with directed edges, in 
which each vertex has in-degree 2 and out-degree 2. One vertex of T is 
distinguished as the root. See Figure 6 later. The essence of the method 
is that one can do exact calculations within the infinite model which one 
expects to give the correct $n \to \infty$ asymptotics for the finite models. Making
the connection rigorous is a challenge we do not address here, except for brief 
comments in section EP1 Instead we focus on exhibiting the calculations. 
First we copy our finite random network model to 

The infinite network model. Fix a probability distribution on functions
$\Phi(v)$ (equivalently: on functions $\phi(v) = v^{-1}\Phi(v)$). For each
edge $e$ of the infinite tree $T$, let $\Phi(e,v)$ be chosen independently from this
probability distribution.

A flow $\mathbf{f} = (f(e))$ in the infinite network is required only to satisfy the
"in-flow equals out-flow" condition at each vertex: there are no sources
and destinations (think of flows from and to infinitely distant boundaries).
Intuitively, "normalized volume of flow" $v(\mathbf{f})$ is the average flow per edge over
the infinite network. It is more convenient to interpret this, via the ergodic 
principle, as the expected value of the flow through a typical edge, when 
we require flows to be invariant. Roughly (see 0] for further discussion) 
invariant means that the joint distribution of flow and edge-capacities and 
edge-costs is not dependent on the choice of root vertex. In particular, for 
an invariant flow f the quantity 

$$v(\mathbf{f}) = E[f(e)]$$

does not depend on choice of edge $e$. This quantity $v(\mathbf{f})$ is our definition
of normalized volume of the flow $\mathbf{f}$. Similarly we define the normalized cost
associated with a flow as

$$c(\mathbf{f}) = E[\Phi(e, f(e))]$$

where again the choice of $e$ does not matter. Then we study the cost-volume
relationship described by the curve $c = \Psi(v)$:

$$\Psi(v) = \min\{c(\mathbf{f}) : \mathbf{f} \text{ an invariant flow with } v(\mathbf{f}) = v\}$$

defined for

$$0 < v < v^* = \max\{v(\mathbf{f}) : \mathbf{f} \text{ an invariant flow}\} \le \infty. \qquad (7)$$



3.2 Outline of methodology 

We are dealing with a minimization-under-constraint problem, so it is
natural to introduce a Lagrange multiplier $\lambda > 0$ and consider the problem
conceptually as

$$\text{minimize (cost of flow)} - \lambda \times (\text{volume of flow}). \qquad (8)$$

We analyze this problem on the infinite network $T$ as outlined below.

Step 1. Relative to a reference edge $e^*$, the tree $T$ splits into two
statistically similar rooted trees $T^+$ and $T^-$.

Step 2. On $T^+$ consider

$$X(v) = \text{minimum of (8) over flows with } f(e^*) = v$$

measured relative to the $v = 0$ case.

Step 3. $T^+$ recursively decomposes into three subtrees, statistically
similar to each other and to $T^+$. The process $X(v)$ is deterministically
related to the corresponding quantities $X_i(v)$ on the subtrees, and the
functions $\Phi(e_i, v)$ on adjacent edges $e_i$. This implies that the distribution of
$(X(v),\ v \ge 0)$ satisfies a certain recursive distributional equation (RDE),
equation (11).

Step 4. The flow $f(e^*)$ across $e^*$ in the flow $\mathbf{f}$ optimizing (8) is now
determined by the processes $(X^+(v),\ v \ge 0)$ and $(X^-(v),\ v \ge 0)$ on $T^+$ and
$T^-$.

Step 5. From this optimal flow $\mathbf{f}_\lambda$ we calculate normalized cost $c(\mathbf{f}_\lambda)$
and normalized volume $v(\mathbf{f}_\lambda)$ which then determine the cost-volume curve.

3.3 Discussion of methodology

(a) The logic of why we expect this method to give correct answers is
somewhat complicated: here we rephrase the outline from [3] sec. 7.5 (see also
[3] sec. 5). Firstly, even though the definition of $X(v)$ is non-rigorous (the
quantity (9) equals $\infty - \infty$), a solution of the RDE (11) can be used to
construct a $T$-indexed invariant random process $(X_e(v))$ with this solution as
marginal distribution. In turn this process can be used to define a flow on
$T$, and the argument which derives the RDE can (one hopes) be recycled
into an argument that this flow is indeed the optimal flow on the infinite
tree. Identifying this infinite-network optimal flow as the limit of
finite-network optimal flows is the second issue. Local weak convergence implies
that subsequential weak limits of optimal finite-$n$ flows are feasible flows on
$T$, but the issue is to show that from the optimal flow on $T$ one can
synthesize near-optimal flows on the finite networks. One needs to show that the
$T$-indexed process has a certain "trivial tail $\sigma$-field" property (discussed
carefully in [3] under the name endogeny). This implies the optimal flow on
an edge is a measurable function of the random edge cost-volume functions
on other edges. This enables one to construct quasi-flows (which almost
satisfy the balance requirement at each vertex) on the finite networks, so
then one needs to show that quasi-flows can be converted to genuine flows
with negligible extra cost.

(b) This methodology is fundamentally the same idea as the non-rigorous 
cavity method developed in the 1980s in the study of statistical physics mod- 
els of disordered systems. See [H] for the recent survey most useful for our 
purposes. Though intended primarily for study of "interacting particle" 
physics models, it was noted in the 1980s ^3] that these methods could be 
applied also to combinatorial optimization problems (matching, traveling 
salesman) on random points in an artificial "mean field" model of geometry 
(complete graph with independent random edge lengths), and recently have 
been applied to problems such as random K-SAT ^3J. Rigorizing cavity 
method arguments in combinatorial optimization is a project of contempo- 
rary interest in theoretical probability, as yet carried through in only two 
hard problems: see [2] for the mean-field matching problem and for some 
random graph questions. Our example in section 2.4 (maximum density
of edge-disjoint infinite paths in a randomly obstructed infinite tree) seems 
a natural next problem for rigorous study. But in this paper we focus on 
demonstrating the range of applicability of the non-rigorous methodology to 
network flow problems. We remark that the third issue in (a) is particular 
to the network flow setting, so has not been studied in previous work. 

(c) In most examples we don't expect to be able to find an explicit
analytic solution of the RDE; instead we use bootstrap Monte Carlo (section
3.7) to approximate the solution and derive the numerical results shown
in section 2. The theoretical issue of proving uniqueness of solutions is
often difficult. In the examples in this paper, we always take $\phi(e, v)$ to be
non-decreasing in $v$, so that $\Phi(e, v)$ is convex in $v$. By analogy with the
deterministic setting (where a convex function attains its minimum at a
unique point) one might expect convexity to imply uniqueness of solutions
of RDEs, but we do not see any simple general argument.

(d) RDEs are at the center of this formulation of the cavity method. As
well as their appearance in these kinds of mean field (disordered network)
optimization problems, they arise in a broad range of applied probability
problems, as illustrated in the survey [3].



3.4 The general case 



We now show how to implement the section 3.2 methodology in the general
infinite network model of section 3.1.

Fix an edge $e^* = (w_-, w_+)$ in $T$. (We use $w$ to denote a vertex, since
we are using $v$ for volume). Delete the other edges at $w_-$ and write $T^+ =
(V^+, E^+)$ for the component containing $w_+$; this is an infinite tree with the
same properties as $T$ except that the distinguished vertex $w_-$ has out-degree
1 and in-degree 0. See Figure 6.



Figure 6. The tree $T^+$ and its recursive decomposition into $T_1, T_2, T_3$.
[Diagram labels: $T^+$, $w_-$, $w_+$.]

Fix a realization of edge cost-volume functions $(\Phi(e, v),\ e \in E^+ \setminus \{e^*\})$. Let
$\mathcal{F}^+$ be the set of flows $\mathbf{f}$ on $E^+$ which satisfy the balance constraints at each
vertex except $w_-$. For $0 \le v < \infty$ define

$$X(v) = \inf_{\mathbf{f} \in \mathcal{F}^+ :\, f(e^*) = v} \ \sum_{e \in E^+ \setminus \{e^*\}} (\Phi(e, f(e)) - \lambda f(e))
 \;-\; \inf_{\mathbf{f} \in \mathcal{F}^+ :\, f(e^*) = 0} \ \sum_{e \in E^+ \setminus \{e^*\}} (\Phi(e, f(e)) - \lambda f(e)). \qquad (9)$$

As written, one cannot make rigorous sense of these infinite sums. The
heuristic idea is to interpret each sum as a $r \to \infty$ limit of

(sum over $e$ within distance $r$ from $e^*$) $- a_r$

for normalizing constants $a_r$, and then the constants cancel when we subtract
to compare the $v > 0$ case with the $v = 0$ case. Conceptually, $X(v)$ measures
the relative effect of insisting that the flow through $e^*$ be exactly $v$.

We next derive the recursion for $X(v)$. Recall $e^* = (w_-, w_+)$ is the
distinguished edge in $T^+$. Write $e_1^*$ for the other edge directed into $w_+$, and
write $e_2^*, e_3^*$ for the two edges directed out of $w_+$. By cutting at $w_+$, we can
decompose $T^+$ into three subtrees $T_1, T_2, T_3$ and the single edge $e^*$, where
each $T_i$ contains $e_i^*$. Each $T_i$ is isomorphic to $T^+$ (with edge-reversal, in
the case of $T_1$), and has a distinguished edge $e_i^*$ with $w_+$ as the exceptional
vertex isomorphic to $w_-$. See Figure 6.

On each $T_i$ define $X_i(v)$ as at (9). We will show

$$X(v) = \inf_{v + v_1 = v_2 + v_3} \ \sum_{i=1}^{3} (\Phi(e_i^*, v_i) - \lambda v_i + X_i(v_i))
 \;-\; \inf_{v_1 = v_2 + v_3} \ \sum_{i=1}^{3} (\Phi(e_i^*, v_i) - \lambda v_i + X_i(v_i)). \qquad (10)$$

To derive this equality, rewrite (9) as

$$X(v) = \tilde{X}(v) - \tilde{X}(0)$$

where $\tilde{X}(v)$ denotes the first infimum in (9). In a flow $\mathbf{f}$ on $T^+$ with
$f(e^*) = v$, the flows $f(e_1^*) = v_1$, $f(e_2^*) = v_2$, $f(e_3^*) = v_3$
must satisfy $v + v_1 = v_2 + v_3$. For a given value of $v_i$ the contribution to the
sum in (9) from edges in $T_i$ equals

$$\Phi(e_i^*, v_i) - \lambda v_i + \tilde{X}_i(v_i)$$

because we obviously choose the optimal flow on $T_i$ for the given $v_i$.
Optimizing over choices of $(v_i)$ gives

$$\tilde{X}(v) = \inf_{v + v_1 = v_2 + v_3} \ \sum_{i=1}^{3} (\Phi(e_i^*, v_i) - \lambda v_i + \tilde{X}_i(v_i))$$

and this leads to (10), because replacing each $\tilde{X}_i(v_i)$ by $X_i(v_i) = \tilde{X}_i(v_i) - \tilde{X}_i(0)$
shifts both infima by the same constant.

A key point is that the subtrees $T_i$ with their costs and capacities are
isomorphic to $T^+$ with its costs and capacities; and so the three processes
$(X_i(v),\ i = 1,2,3)$ are independent and have the same distribution as $(X(v))$.
Note here that the "edge-reversal" involved with $e_1^*$ makes no difference,
since our model is invariant in distribution under edge-reversal. Thus (10)
implies a recursive distributional equation (RDE) for the "unknown"
distribution of $X = (X(v),\ v \ge 0)$, as follows.

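$$X \overset{d}{=} F_\lambda(X_1, X_2, X_3, \Phi_1, \Phi_2, \Phi_3) \qquad (11)$$

where $X_1, X_2, X_3, \Phi_1, \Phi_2, \Phi_3$ are independent; the $\Phi_i$ are distributed as the
edge-cost $\Phi$, the $X_i$ are distributed as $X$, and
$F_\lambda(x_1(\cdot), x_2(\cdot), x_3(\cdot), \phi_1(\cdot), \phi_2(\cdot), \phi_3(\cdot))$
is the function

$$v \to \inf_{v + v_1 = v_2 + v_3} \ \sum_{i=1}^{3} (\phi_i(v_i) - \lambda v_i + x_i(v_i))
 \;-\; \inf_{v_1 = v_2 + v_3} \ \sum_{i=1}^{3} (\phi_i(v_i) - \lambda v_i + x_i(v_i)).$$

(Here $\phi(\cdot)$ denotes a typical value of $\Phi(\cdot)$.)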

Recall the construction of $T^+$ as the subtree of $T$ on one side of the
edge $e^*$. Construct an opposite subtree $T^-$ of $T$ by again starting with the
edge $e^* = (w_-, w_+)$, and now deleting the other edges at $w_+$ to leave $T^-$ as
the component containing $w_-$. So $T^-$ is isomorphic, under edge-reversal, to
$T^+$. Write $(X^+(v))$ and $(X^-(v))$ for the processes (9) on $T^+$ and $T^-$. Now
consider minimizing, over flows $\mathbf{f}$ on the entire tree $T$, the quantity

$$\sum_{e \in E} (\Phi(e, f(e)) - \lambda f(e)).$$

Any flow $\mathbf{f}$ decomposes into flows on $T^+$ and on $T^-$ with the same value of
$v = f(e^*)$. Minimizing the quantity above for a given value of $v$ gives

$$\Phi(e^*, v) - \lambda v + X^+(v) + X^-(v).$$

Thus the optimal flow is obtained by minimizing over $v$, and the flow across
$e^*$ is

$$f(e^*) = \arg\min_v \,(\Phi(e^*, v) - \lambda v + X^+(v) + X^-(v)). \qquad (12)$$
This optimal flow $\mathbf{f} = \mathbf{f}_\lambda$ has normalized volume and cost (section 3.1)

$$v(\mathbf{f}_\lambda) = E[f(e^*)] \qquad (13)$$
$$c(\mathbf{f}_\lambda) = E[\Phi(e^*, f(e^*))]. \qquad (14)$$

This completes the analytic arguments. We now do bootstrap Monte Carlo
(section 3.7) to numerically compute the solution of the RDE (11) and then
use (12, 13, 14) to get the numerical results presented in sections 2.1, 2.2 and
2.5.

3.5 Unit edge capacities 

We now turn to the specialization where each edge has unit capacity and
the cost-per-unit-volume on an edge is constant up to volume 1:

$$\phi(e,v) = \begin{cases} C(e), & 0 \le v \le 1 \\ \infty, & v > 1. \end{cases}$$

So the randomness is supplied only via the i.i.d. edge-costs $C(e)$. Call a
flow $\mathbf{f}$ with

$$f(e) = 0 \text{ or } 1 \text{ for all } e$$

a 0-1 flow. If we consider a random 0-1 flow $\mathbf{F} = (F(e))$ then the
expectations $f(e) = E[F(e)]$ form a flow with $0 \le f(e) \le 1$. Conversely, any
flow with $0 \le f(e) \le 1$ can be represented as the expectation of a random
0-1 flow. It follows that in our optimization problem (8) we need only
consider 0-1 flows. This simplifies the mathematical structure, because
in the RDE (11) we now need consider only $X(1)$, which we re-name as $X$.
Looking at (11), there are only three possible values of $(v_1, v_2, v_3)$ for each
case $v = 0, 1$:

    (v = 1): (0,1,0), (0,0,1), (1,1,1)
    (v = 0): (0,0,0), (1,1,0), (1,0,1).

So (11) becomes a RDE for an unknown distribution of a real-valued random
variable $X$:

$$X \overset{d}{=} \min\Big(C_2 - \lambda + X_2,\ C_3 - \lambda + X_3,\ \sum_{i=1}^{3} (C_i - \lambda + X_i)\Big)
 \;-\; \min\Big(0,\ \sum_{i=1,2} (C_i - \lambda + X_i),\ \sum_{i=1,3} (C_i - \lambda + X_i)\Big). \qquad (15)$$
Next, the formula (12) for optimal flow across $e^*$ says, in the present setting,
that the optimal flow has unit flow across $e^*$ iff the arg min in (12) equals 1
instead of 0, that is iff

$$0 > C(e^*) - \lambda + X^+ + X^-$$

where $X^+$ and $X^-$ are the independent copies of $X$ associated with $T^+$ and
$T^-$. Thus we get the inclusion criterion: $e^*$ is in the optimal flow iff

$$C(e^*) < \lambda - X^+ - X^-. \qquad (16)$$

So the normalized volume and cost of the optimal flow $\mathbf{f}_\lambda$ are

$$v(\mathbf{f}_\lambda) = P(C(e^*) < \lambda - X^+ - X^-) \qquad (17)$$
$$c(\mathbf{f}_\lambda) = E[C(e^*)\, 1_{(C(e^*) < \lambda - X^+ - X^-)}]. \qquad (18)$$

As before, we can now use bootstrap Monte Carlo to solve (15) numerically,
and then use (16, 17, 18) to compute the curve in section 2.3.

3.6 Randomly obstructed networks 

To analyze the section 2.4 model, recall that we can interpret it as the
case of unit capacity edges with costs $C$ such that

$$P(C = 1) = p; \qquad P(C = \infty) = 1 - p.$$

Looking back at (8) we see that, because edge-costs are either 1 or $\infty$, one
must get the maximum volume flow for an arbitrary choice of $\lambda > 1$, and
the solution $X$ of (15) should be supported on multiples of $1 - \lambda$. Examining
(15), we see the latter is correct. Setting $X = Z(1 - \lambda)$ in (15) leads to the
RDE (not depending on $\lambda$)

$$Z \stackrel{d}{=} \max\Bigl( Z_2 + B_2,\ Z_3 + B_3,\ \sum_{i=1,2,3} (Z_i + B_i) \Bigr) - \max\Bigl( 0,\ \sum_{i=1,2} (Z_i + B_i),\ \sum_{i=1,3} (Z_i + B_i) \Bigr) \tag{19}$$

where $Z$ has unknown distribution on $\{-\infty\} \cup \mathbb{Z}$ and where $(B_i)$ are
independent with $P(B = 1) = p$, $P(B = -\infty) = 1 - p$. In terms of two copies
$Z^+, Z^-$ of the solution of this RDE, (17) implies the formula

$$v^*(p) = p \, P(Z^+ + Z^- > -1), \tag{20}$$

where the factor $p$ is the probability $P(C(e^*) = 1)$ that $e^*$ itself is unobstructed.

As usual, we solve this numerically by bootstrap Monte Carlo to obtain the
curve shown in Figure 5.

3.7 Bootstrap Monte Carlo

The abstract structure of a RDE is

$$X \stackrel{d}{=} g(\xi, X_i, i \ge 1)$$

where $g(\cdot)$ and the distribution of $\xi$ are given, and where $(X_i, i \ge 1)$ are
independent copies of an "unknown" distribution $X$. Here $X$ and $\xi$ can take
values in arbitrary spaces. Equivalently, an RDE is a fixed-point equation
for a map $\mu \to T(\mu)$ on probability distributions, where

$$T(\mathrm{dist}(X)) = \mathrm{dist}(g(\xi, X_i, i \ge 1)).$$

The bootstrap Monte Carlo method provides a very easy to implement and
essentially problem-independent method to seek solutions. Start with a list
of $N$ numbers with some empirical distribution $\mu_0$. Regard these as "generation
0" individuals $(X_i^0, 1 \le i \le N)$. Then $T(\mu_0)$ can be approximated as the
empirical distribution $\mu_1$ of $N$ "generation 1" individuals $(X_i^1, 1 \le i \le N)$,
each obtained independently via the following procedure. Take $\xi$ with the
prescribed distribution, take $I_1, I_2, \ldots$ independent uniform on $\{1, 2, \ldots, N\}$
and set

$$X_i^1 = g(\xi, X_{I_1}^0, X_{I_2}^0, \ldots). \tag{21}$$

Repeating for some number $G$ of generations lets one see whether $T^n(\mu_0)$
settles down to a solution of the RDE.
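As an illustration of the procedure just described, here is a minimal Python sketch of bootstrap Monte Carlo, applied to the randomly obstructed network of section 3.6. It assumes that the map in RDE (19) reads $Z = \max(Z_2 + B_2, Z_3 + B_3, \sum_{i=1,2,3}(Z_i + B_i)) - \max(0, \sum_{i=1,2}(Z_i + B_i), \sum_{i=1,3}(Z_i + B_i))$, and it multiplies the probability in (20) by $p$ to account for $e^*$ itself being unobstructed, which is how we read (17). The function names, the all-zero starting population and the small default values of $N$ and $G$ are illustrative assumptions.

```python
import random

NEG_INF = float("-inf")


def bootstrap_rde(g, sample_xi, arity, population, generations, rng):
    """Iterate the empirical map: each new individual is g(xi, X_{I_1}, ..., X_{I_arity})."""
    n = len(population)
    for _ in range(generations):
        population = [
            g(sample_xi(rng), [population[rng.randrange(n)] for _ in range(arity)])
            for _ in range(n)
        ]
    return population


def g_obstructed(b, z):
    """Assumed right-hand side of RDE (19); b = (B_1, B_2, B_3), z = (Z_1, Z_2, Z_3)."""
    s = [zi + bi for zi, bi in zip(z, b)]  # Z_i + B_i, possibly -inf
    first = max(s[1], s[2], s[0] + s[1] + s[2])
    second = max(0, s[0] + s[1], s[0] + s[2])  # always finite and >= 0
    return first - second


def estimate_v_star(p, n=2000, generations=30, seed=0):
    """Estimate v*(p) as p * P(Z+ + Z- > -1) from the final population."""
    rng = random.Random(seed)

    def sample_b(r):
        # B = 1 with probability p, -inf (obstructed edge) otherwise
        return tuple(1 if r.random() < p else NEG_INF for _ in range(3))

    pop = bootstrap_rde(g_obstructed, sample_b, 3, [0] * n, generations, rng)
    hits = sum(pop[rng.randrange(n)] + pop[rng.randrange(n)] > -1 for _ in range(n))
    return p * hits / n
```

A call such as `estimate_v_star(0.8)` gives a rough estimate only; the sampling error shrinks as the population size and number of generations are increased toward the values quoted in the next paragraph.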

Experience with a range of RDEs indicates that taking $N = 200,000$ as
"population size" and iterating through $G = 200$ "generations" gives reliable
solutions. When dist($X$) is just a distribution on the real line, this procedure
requires only $4 \times 10^7$ evaluations of the form (21), which is computationally
easy when $g(\cdot)$ is simple to evaluate. This is the situation for Table 1
and Figure 5. But in the general setting of this paper, where the unknown
distribution is of a process $X = X(v)$, and the function $g$ involves minimizing
over choices $(v_1, v_2, v_3)$ as at (11), the computational problem becomes
harder. Our results in Figures 2,3,5 used a crude implementation where we
represented $X$ via evaluation at 60 grid points $(X(u_1), X(u_2), \ldots, X(u_{60}))$.
So one evaluation of (21) requires $60^3$ steps, meaning that using the previous
values of $N$ and $G$ would require more than $8 \times 10^{12}$ steps. This being
infeasible, we used smaller values of $N$ and $G$, and the resulting "sampling
error" is visible in the irregularities in Figures 2,3,5, where we plotted actual
data rather than a smoothed curve.

4 Other analysis



4.1 The linear case and first passage percolation 

The linear case is the case

$$\Phi(e, v) = \kappa(e) v, \quad 0 \le v < \infty$$

where the $\kappa(e)$ are i.i.d. as $e$ varies. In other words, the cost-per-unit-flow
$\phi(e, v) = \kappa(e)$ does not depend on volume of flow. This has the obvious
scaling property that if $f$ has $v(f) = v_0$, $c(f) = c_0$ then a scaled flow $af$ has
$v(af) = a v_0$, $c(af) = a c_0$. So the network cost-per-unit-volume function
must be of the form

$$c = \psi(v) = R$$

where $R$ depends on the distribution of $\kappa(e)$. It is easy to see that $R$ can be
identified with the time constant for first-passage percolation on $T$, that is
the limit

$$n^{-1} \min_{\text{path root} = w_0, w_1, \ldots, w_n} \sum_{i=1}^{n} \kappa(w_{i-1}, w_i) \to R \text{ a.s. as } n \to \infty.$$

It is well known that, as a specialization of general results for branching
random walk (cf. [8] Example 6.7.3), $R$ can be calculated as the solution of

$$\inf_{\theta > 0} \bigl( \log E[\exp(-\theta \kappa(e))] + \theta R \bigr) = \log 1/2. \tag{22}$$
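As a worked illustration of (22), the following Python sketch solves the equation numerically for a given log-Laplace transform $\theta \to \log E[\exp(-\theta \kappa)]$. The choice $\kappa \sim$ Exponential(1) in the usage note is our assumption, made only because the answer can be checked by hand; the function name and the search bounds are likewise hypothetical.

```python
import math


def time_constant(log_laplace, mean_cost, theta_max=50.0):
    """Solve inf_{0 <= theta <= theta_max} (log_laplace(theta) + theta*R) = log(1/2) for R.

    log_laplace(theta) should return log E[exp(-theta * kappa)].
    The solution is bracketed between 0 and mean_cost = E[kappa].
    """
    target = math.log(0.5)

    def infimum(r):
        # The bracket is convex in theta, so ternary search locates its minimum.
        lo, hi = 0.0, theta_max
        for _ in range(200):
            m1 = lo + (hi - lo) / 3.0
            m2 = hi - (hi - lo) / 3.0
            if log_laplace(m1) + m1 * r < log_laplace(m2) + m2 * r:
                hi = m2
            else:
                lo = m1
        theta = 0.5 * (lo + hi)
        return log_laplace(theta) + theta * r

    # The infimum is nondecreasing in R, so bisection finds the crossing point.
    r_lo, r_hi = 0.0, mean_cost
    for _ in range(100):
        r_mid = 0.5 * (r_lo + r_hi)
        if infimum(r_mid) < target:
            r_lo = r_mid
        else:
            r_hi = r_mid
    return 0.5 * (r_lo + r_hi)
```

For Exponential(1) costs, $\log E[\exp(-\theta \kappa)] = -\log(1 + \theta)$, the infimum is attained at $\theta = 1/R - 1$, and (22) reduces to $\log R + 1 - R = \log 1/2$, whose solution is $R \approx 0.232$. Calling `time_constant(lambda t: -math.log1p(t), 1.0)` reproduces this value.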

However, this linear case is (from our viewpoint) degenerate in the sense 
that there is no flow on T attaining the infimum of normalized cost for 
given normalized volume. Instead, there is a sequence of flows which assign 
zero volume to most edges and assign larger and larger volumes to paths 
whose average edge-cost is closer and closer to R. Our non-linear examples, 
and our methodology for analyzing them, rest upon the idea that optimal 
flows on $T$ are actually attained by some minimizing flow $f$.

On the other hand, the linear case does tell us something about the
low-volume limit of the general case. Suppose

$$v \to \phi(e, v) \text{ is increasing}; \quad \kappa(e) := \phi(e, 0+) > 0. \tag{23}$$

Then the low-volume limit of the network cost-per-unit-flow function will
be

$$\psi(0+) = \text{the solution } R \text{ of (22)}. \tag{24}$$

To see why, fix $\varepsilon > 0$ and consider a flow in the linear case with normalized
volume 1 and normalized cost $R + \varepsilon$. The scaled flow with normalized volume
$v$ has, in the linear case, normalized cost $(R + \varepsilon)v$. So the same flow
in the general case has normalized cost $\sim (R + \varepsilon)v$ as $v \downarrow 0$, by (23).

4.2 The customer driven scheme 

As mentioned in section 2.5, one can consider the decentralized routing
scheme in which customers leaving each vertex choose the cheaper out-edge
to traverse next. Writing $e_1, e_2$ for the out-edges at a vertex, and $\phi(e_i, v_i)$ for
the cost-per-unit-volume functions on those edges, the effect of this customer
driven scheme is to adjust the flows $v_i$ so that these marginal costs are equal.
That is, if the total in-flow equals $v$, then the out-flows $v_1, v_2$ are determined
by

$$\phi(e_1, v_1) = \phi(e_2, v_2); \quad v_1 + v_2 = v. \tag{25}$$

We can study the resulting flow in our infinite-network model, though as
noted earlier it is not so clear how results pull back to the finite network
model. Given $v$ and $\phi_1(\cdot), \phi_2(\cdot)$, consider the solution $(v_1, v_2)$ of the analog
of the equation above:

$$\phi_1(v_1) = \phi_2(v_2); \quad v_1 + v_2 = v. \tag{26}$$

Write

$$T(v, \phi_1, \phi_2) = v_1$$

$$W(v, \phi_1, \phi_2) = \phi_1(v_1).$$

It is clear that the flow $Y$ across a typical edge will satisfy the RDE

$$Y \stackrel{d}{=} T(Y_1 + Y_2, \phi_1, \phi_2) \tag{27}$$

where $\phi_i(\cdot)$ denote independent choices from the random cost-per-unit-volume
function $\phi(v) = \Phi(v)/v$ in the model description. To see (27), note that the
flow into a typical vertex is distributed as $Y_1 + Y_2$, so that $T(Y_1 + Y_2, \phi_1, \phi_2)$
represents the flow along one out-edge.

Analogous to the curve $c = \Psi(v)$ giving the normalized cost-volume
relationship for optimal flow through the infinite network, there is a curve
$c = G(v)$ giving the normalized cost-volume relationship for the customer
driven scheme. We expect (27) to have a one-parameter family of solutions
corresponding to different values of $E[Y]$. In terms of these solutions, the
curve is

$$v = E[Y] = E[T(Y_1 + Y_2, \phi_1, \phi_2)] \tag{28}$$

$$c = G(v) = E[T(Y_1 + Y_2, \phi_1, \phi_2) \, W(Y_1 + Y_2, \phi_1, \phi_2)] \tag{29}$$

because the flow of volume $Y = T(Y_1 + Y_2, \phi_1, \phi_2)$ along the typical out-edge
has cost-per-unit-volume equal to $W(Y_1 + Y_2, \phi_1, \phi_2)$.

We now specialize to the "quadratic cost" case of section 2.5. Here

$$\phi_i(v) = \kappa_i v$$

and so we can solve (26) to get

$$T(v, \phi_1, \phi_2) = \frac{\kappa_2 v}{\kappa_1 + \kappa_2}$$

$$W(v, \phi_1, \phi_2) = \frac{\kappa_1 \kappa_2 v}{\kappa_1 + \kappa_2}.$$

The RDE (27) becomes

$$Y \stackrel{d}{=} \frac{\kappa_2}{\kappa_1 + \kappa_2} (Y_1 + Y_2). \tag{30}$$

Specializing (28), (29) we see that the network cost-volume curve will be

$$c = G(v) = g v^2$$

$$g = E\left[ \frac{\kappa_1 \kappa_2^2}{(\kappa_1 + \kappa_2)^2} (Y_1 + Y_2)^2 \right] \tag{31}$$

where $Y$ is the solution of (30) with $E[Y] = 1$. Note the random variables
in (31) are all independent.
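To make (30) and (31) concrete, the following Python sketch estimates $g$ by bootstrap Monte Carlo in the quadratic cost case. It assumes $\kappa \sim$ Gamma($a, a$), the special case treated in the paragraphs that follow, where $g = 1$ is derived analytically. It also rescales each generation to mean 1 in order to select the solution with $E[Y] = 1$; that rescaling step, the function names and the population sizes are our assumptions rather than part of the method as stated.

```python
import random


def split_quadratic(v, k1, k2):
    """Solve (26) for phi_i(v) = k_i * v.

    Returns (T, W): the flow v_1 on the first out-edge and the common
    cost-per-unit-volume k_1 * v_1 = k_2 * v_2.
    """
    v1 = k2 * v / (k1 + k2)
    return v1, k1 * v1


def estimate_g(a, n=20000, generations=30, seed=1):
    """Bootstrap the RDE (30) with kappa ~ Gamma(a, a), then estimate g from (31)."""
    rng = random.Random(seed)

    def kappa():
        # shape a and scale 1/a give mean 1 and variance 1/a
        return rng.gammavariate(a, 1.0 / a)

    pop = [1.0] * n
    for _ in range(generations):
        new = []
        for _ in range(n):
            inflow = pop[rng.randrange(n)] + pop[rng.randrange(n)]
            t, _w = split_quadratic(inflow, kappa(), kappa())
            new.append(t)
        mean = sum(new) / n
        pop = [x / mean for x in new]  # assumption: pin E[Y] = 1 each generation

    total = 0.0
    for _ in range(n):
        inflow = pop[rng.randrange(n)] + pop[rng.randrange(n)]
        t, w = split_quadratic(inflow, kappa(), kappa())
        total += t * w  # volume times cost-per-unit-volume, the summand in (31)
    return total / n
```

With $a = 2$ the returned estimate should fluctuate around 1, up to Monte Carlo error that decreases as `n` grows.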

We now consider the special case where the distribution of $\kappa$ is Gamma($a, a$)
for some $0 < a < \infty$. We will show (as stated in section 2.5) that in this
case $g = 1$. Recall the Gamma($a, a$) distribution has mean 1 and variance
$1/a$. It is a classical fact that

$$\frac{\kappa_2}{\kappa_1 + \kappa_2} \text{ and } \kappa_1 + \kappa_2 \text{ are independent.} \tag{32}$$

It follows that the solution $Y$ of (30) is the same Gamma($a, a$) distribution,
because for such $Y$

$$\frac{\kappa_2}{\kappa_1 + \kappa_2} (Y_1 + Y_2) \stackrel{d}{=} \frac{\kappa_2}{\kappa_1 + \kappa_2} (\kappa_1 + \kappa_2) = \kappa_2 \stackrel{d}{=} Y.$$

It remains to evaluate $g$. First observe

$$E[(Y_1 + Y_2)^2] = 4 + 2 \, \mathrm{var}(Y) = 4 + 2a^{-1}.$$

Next, writing $\beta = \frac{\kappa_2}{\kappa_1 + \kappa_2}$ and $A = \kappa_1 + \kappa_2$ for the independent random
variables in (32), we can write

$$\frac{\kappa_1 \kappa_2^2}{(\kappa_1 + \kappa_2)^2} = \kappa_1 \beta^2 = A (1 - \beta) \beta^2.$$

Since $E[A] = 2$ we can insert into (31) to get

$$g = (8 + 4a^{-1}) E[\beta^2 - \beta^3].$$

But $\beta$ has the Beta($a, a$) density

$$f(x) = x^{a-1} (1 - x)^{a-1} \, \Gamma(2a)/\Gamma^2(a), \quad 0 < x < 1$$

where $\Gamma(\cdot)$ is the gamma function. The $k$'th moment of this distribution
equals $\frac{\Gamma(2a) \Gamma(a+k)}{\Gamma(a) \Gamma(2a+k)}$, and a brief calculation shows

$$E[\beta^2 - \beta^3] = \frac{a}{4(2a + 1)}$$

so that indeed $g = 1$.

4.3 Endpoint behavior in the randomly obstructed network model

The endpoint behavior observed numerically in Figure 5 is not too hard to
understand theoretically in the infinite tree model, as we now outline. First
we assert

$$v^*(p) \le p \, \theta^2(p)$$

where $\theta(p)$ is the non-extinction probability for the Galton-Watson branching
process with Binomial($2, p$) offspring. Recalling that we need only consider
0-1 flows, this is clear because in order to have a unit flow through $e$,
we need $e$ itself to be non-obstructed (probability $p$) and we need there to
exist infinite non-obstructed paths starting from each end-vertex of $e$
(probability $\theta(p)$ each). An elementary calculation gives $\theta(p) \sim 8(p - 1/2)$ as
$p \downarrow 1/2$, and so we deduce

$$v^*(p) \le (32 + o(1)) (p - \tfrac{1}{2})^2 \text{ as } p \downarrow \tfrac{1}{2}.$$

Turning to the case where $p$ is close to 1, dual to the optimal flow is the
complementary "0-flow" consisting of the set of edges with no flow; this set
must make an edge-disjoint collection of doubly infinite paths containing
every obstructed edge. By considering where the obstructed edges appear
in this 0-flow it is easy to see the identity

$\frac{1 - v^*(p)}{1 - p}$ = mean number of edges traversed in the optimal 0-flow,
starting at a typical obstructed edge, until the next obstructed
edge is reached.

Now this mean is $\ge E[M]$ where $M$ is the number of edges, starting
at the root of $T$ and following directed edges, needed to reach the closest
obstructed edge. Because there are $2 + 2^2 + \ldots + 2^{m-1}$ edges at distance
$< m$ we see

$$E[M] = \sum_{m=1}^{\infty} P(M \ge m) = \sum_{m=1}^{\infty} p^{2 + 2^2 + \ldots + 2^{m-1}} = \log_2 \frac{1}{1-p} \pm O(1) \text{ as } p \uparrow 1.$$

This shows

$$1 - v^*(p) \ge \Bigl( \log_2 \frac{1}{1-p} - O(1) \Bigr) (1 - p) \text{ as } p \uparrow 1.$$
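Both endpoint quantities above are easy to evaluate numerically. The Python sketch below computes $\theta(p)$ by iterating the Binomial($2, p$) offspring generating function $q \to (1 - p + pq)^2$ for the extinction probability, and sums the series for $E[M]$ using $2 + 2^2 + \ldots + 2^{m-1} = 2^m - 2$. The function names, the iteration count and the truncation threshold are our own illustrative choices.

```python
import math


def survival_probability(p, iterations=10000):
    """Non-extinction probability theta(p) for Binomial(2, p) offspring."""
    q = 0.0  # probability of extinction by generation 0
    for _ in range(iterations):
        q = (1.0 - p + p * q) ** 2  # offspring generating function
    return 1.0 - q


def flow_upper_bound(p):
    """The bound p * theta(p)**2 on v*(p)."""
    return p * survival_probability(p) ** 2


def expected_distance(p, max_terms=200):
    """E[M] = sum over m >= 1 of p**(2**m - 2), truncated once terms are negligible."""
    total, term = 0.0, 1.0  # term = p**(2**m - 2), starting at m = 1
    for _ in range(max_terms):
        total += term
        if term < 1e-18:
            break
        term = term * term * p * p  # exponent 2**(m+1) - 2 = 2*(2**m - 2) + 2
    return total
```

For $p > 1/2$ the iteration agrees with the closed form $\theta(p) = (2p - 1)/p^2$, which gives the asymptotics $\theta(p) \sim 8(p - 1/2)$ quoted above. For $p$ near 1, `expected_distance(p)` stays within a bounded distance of $\log_2 \frac{1}{1-p}$.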

These arguments give the easier directions of inequalities, but proving
complementary bounds

$$v^*(p) \ge a_0 (p - \tfrac{1}{2})^2 \text{ as } p \downarrow \tfrac{1}{2} \quad (\text{for some } a_0 > 0)$$

$$1 - v^*(p) \le a_1 (1 - p) \log_2 \frac{1}{1-p} \text{ as } p \uparrow 1 \quad (\text{for some } a_1 < \infty)$$

is surely within the scope of known methods of theoretical probabilistic
combinatorics, though we have not tried to write down details.



5 Discussion 



5.1 Other underlying graph models 

The calculations go over with only straightforward changes to any model 
which is "locally tree-like" in the sense of local weak convergence to some 
limit infinite (maybe random) tree. Such models include 

(i) the classical Erdos-Renyi random graph model [31 EI] (more precisely, 
the giant component in the sparse supercritical regime); 

(ii) recent "complex network" models designed to have power-law degree
distributions [16].

On the other hand, models which pay attention to Euclidean geometry of
vertex positions, such as random geometric graphs [17], are not locally tree-like
and rarely permit analytical derivation of exact limit formulas.

5.2 Other flow and routing problems

From the algorithmic viewpoint, finding optimal routes through a realization 
of the random layer network with cost-volume functions on each edge is not 
easy. It is therefore remarkable that one can give a theoretical analysis 
of costs of the optimal routing without any consideration whatsoever of 
algorithmic issues! Of course, our focus on the global optima is unrealistic, 
and it would be interesting to use our models as a testbed for comparative 
analysis of different distributed routing algorithms. 

Acknowledgement. We thank Frank Kelly for helpful discussions.